OverFlow: Multi-Site Aware Big Data Management for Scientific Workflows on Clouds
Similar resources
A Data-aware Partitioning and Optimization Method for Large-scale Scientific Workflows in Hybrid Clouds
While hybrid cloud computing environments offer good potential for achieving high performance and low economic cost, they also introduce a broad set of unpredictable overheads, especially for running data-intensive applications. This paper describes a novel approach which refines workflow structures and optimizes intermediate data transfers for large-scale scientific workflows containing thousan...
Multi-objective scheduling of Scientific Workflows in multisite clouds
Clouds appear as appropriate infrastructures for executing Scientific Workflows (SWfs). A cloud is typically made of several sites (or data centers), each with its own resources and data. Thus, it becomes important to be able to execute some SWfs at more than one cloud site because of the geographical distribution of data or available resources among different cloud sites. Therefore, a major pr...
Data Locality-Aware Big Data Query Evaluation in Distributed Clouds
With more and more businesses and organizations outsourcing their IT services to distributed clouds for cost savings, historical and operational data generated by the services have been growing exponentially. The generated data, referred to as big data and stored at different geographic datacenters, have now become an invaluable asset to these businesses and organizations, as they can make use ...
The Need for Resilience Research in Workflows of Big Compute and Big Data Scientific Applications
Projections and reports about exascale failure modes conclude that we need to protect numerical simulations and data analytics from an increasing risk of hardware and software failures and silent data corruptions (SDC) [1, 4]. At this scale, hardware and software failures could be as frequent as ten or more per day. According to [9], the semiconductor industry will have increased difficulty pre...
Efficient Management of Geographically Distributed Big Data on Clouds
Nowadays cloud infrastructures allow storing and processing increasing amounts of scientific data. However, most existing large-scale data management frameworks assume that users deploy their data-intensive applications in a single data center; few of them focus on inter-data-center data flows. Managing data across geographically distributed data centers is not tr...
Journal
Journal title: IEEE Transactions on Cloud Computing
Year: 2016
ISSN: 2168-7161
DOI: 10.1109/tcc.2015.2440254